A trainable algorithm for summarizing news stories

Authors

  • Joel Larocca Neto
  • Alexandre D. Santos
  • Celso A.A. Kaestner
  • Alex A. Freitas
  • Julio C. Nievola
Abstract

This work proposes a trainable system for summarizing news and obtaining an approximate argumentative structure of the source text. To achieve these goals we use several techniques and heuristics, such as detecting the main concepts in the text, connectivity between sentences, occurrence of proper nouns, anaphors, discourse markers and a binary-tree representation (due to the use of an agglomerative clustering algorithm). The proposed system was evaluated on a set of 800 documents.
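The abstract attributes the binary-tree representation to agglomerative clustering. The sketch below illustrates that general idea only and is not the authors' implementation: it repeatedly merges the two most similar sentence clusters, so the merge history forms a binary tree. The Jaccard word-overlap similarity and the names `tokens`, `jaccard`, and `agglomerate` are assumptions made for illustration; the paper's actual features and similarity measure are not given here.

```python
import re
from itertools import combinations


def tokens(sentence):
    """Lowercased word set for one sentence (assumed, crude text representation)."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


def jaccard(a, b):
    """Word-overlap similarity between two token sets, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def agglomerate(sentences):
    """Merge the two most similar clusters until a single cluster remains.

    Each cluster is a pair (tree, token_set). A leaf tree is a sentence index;
    an internal node is a 2-tuple of subtrees, so the result is a binary tree.
    """
    if not sentences:
        raise ValueError("need at least one sentence")
    clusters = [(i, tokens(s)) for i, s in enumerate(sentences)]
    while len(clusters) > 1:
        # Pick the pair of clusters with the highest similarity.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: jaccard(clusters[p[0]][1], clusters[p[1]][1]),
        )
        (tree_i, tok_i), (tree_j, tok_j) = clusters[i], clusters[j]
        merged = ((tree_i, tree_j), tok_i | tok_j)
        # Drop the two merged clusters and append their parent node.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


if __name__ == "__main__":
    story = [
        "The bank raised interest rates.",
        "Interest rates at the bank rose again.",
        "The football team won the match.",
    ]
    print(agglomerate(story))
```

On the three-sentence example, the two sentences about interest rates are merged first and the unrelated sentence joins at the root, giving the tree `(2, (0, 1))`. A real system would likely replace the word-overlap measure with richer features such as the ones the abstract lists.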

Similar resources

Genetic algorithm for summarizing news stories

This paper presents a new approach to summarizing broadcast news using Genetic Algorithms. We propose to segment the news programs into stories, and then summarize each story by selecting the frames considered important, to obtain an informative pictorial abstract. The summaries can help viewers to estimate the importance of the news video. Indeed, by consulting story summaries we...

NewsInEssence: A System For Domain-Independent, Real-Time News Clustering and Multi-Document Summarization

NEWSINESSENCE is a system for finding, visualizing and summarizing a topic-based cluster of news stories. In the generic scenario for NEWSINESSENCE, a user selects a single news story from a news Web site. Our system then searches other live sources of news for other stories related to the same event and produces summaries of a subset of the stories that it finds, according to parameters specif...

Adaptive Representations for Tracking Breaking News on Twitter

Twitter is often the most up-to-date source for finding and tracking breaking news stories. Therefore, there is considerable interest in developing filters for tweet streams in order to track and summarize stories. This is a non-trivial text analytics task as tweets are short, and standard text similarity metrics often fail as stories evolve over time. In this paper we examine the effectiveness...

Feature Selection for Trainable Multilingual Broadcast News Segmentation

Indexing and retrieving broadcast news stories within a large collection requires automatic detection of story boundaries. This video news story segmentation can use a wide range of audio, language, video, and image features. In this paper, we investigate the correlation between automatically-derived multimodal features and story boundaries in seven different broadcast news sources in three lan...

Topic Models for Summarizing Novelty

We define temporal summaries of news stories as extracting as few sentences as possible from each event within a news topic, where the stories are presented one at a time and sentences from a story must be ranked before the next story can be considered. We outline an evaluation strategy that we have developed for this task and describe simple language models for capturing novelty and usefulness...

Publication date: 2000